{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Database Control\n",
    "\n",
    "The Database Manager Notebook oversees commands related to defining database settings for setting up and performing calculation workflows.  At least one database needs to be defined in order to perform the calculation workflows.  In particular, this Notebook allows for \n",
    "\n",
    "- Defining databases,\n",
    "- Specifying the local run_directories where calculations will be placed/performed,\n",
    "- Copying/uploading reference records from the library to the databases,\n",
    "- Checking the number and status of records within a database,\n",
    "- Cleaning records in a database by resetting any that issued errors, and\n",
    "- Copying/removing database records."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Global workflow details:__\n",
    "\n",
    "The commands offered by this Notebook are outside the global workflow, with the exception that new databases can be defined here before use in the other Notebooks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Library imports**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "iprPy version 0.10.2\n"
     ]
    }
   ],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "# https://github.com/usnistgov/iprPy\n",
    "import iprPy\n",
    "print('iprPy version', iprPy.__version__)\n",
    "\n",
    "settings = iprPy.Settings()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Define databases\n",
    "\n",
    "Settings for accessing databases can be stored under simple names for easy access later."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **list_databases()** function returns a list of all of the names for the stored databases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['master', 'iprhub', 'master_local', 'library_local', 'potentials']\n"
     ]
    }
   ],
   "source": [
    "print(settings.list_databases)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **set_database()** function allows for database access information to be saved under a simple name."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.1 Mongo database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Database master already defined.\n",
      "Overwrite? (yes or no): no\n"
     ]
    }
   ],
   "source": [
    "# Define local mongo database to save files to\n",
    "settings.set_database(name='master', style='mongo', host='localhost', port=27017, database='iprPy')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "database style mongo at localhost:27017.iprPy\n"
     ]
    }
   ],
   "source": [
    "master = iprPy.load_database('master')\n",
    "print(master)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.2 CDCS database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Database potentials already defined.\n",
      "Overwrite? (yes or no): no\n"
     ]
    }
   ],
   "source": [
    "# Specify remote CDCS database to save files to\n",
    "host = 'https://potentials.nist.gov/'\n",
    "user = 'lmh1'\n",
    "pswd = 'XXXXXXXXXXXXXXXXX'\n",
    "\n",
    "# Define mdcs database called iprhub\n",
    "settings.set_database(name='potentials', style='cdcs', host=host, user=user, pswd=pswd)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "database style cdcs at https://potentials.nist.gov/\n"
     ]
    }
   ],
   "source": [
    "remote = iprPy.load_database('potentials')\n",
    "print(remote)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.3 Local database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Database master_local already defined.\n",
      "Overwrite? (yes or no): no\n"
     ]
    }
   ],
   "source": [
    "# Define local database called master_local\n",
    "host = 'e:/calculations/ipr/master'\n",
    "settings.set_database(name='master_local', style='local', host=host)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "database style local at E:\\calculations\\ipr\\master\n"
     ]
    }
   ],
   "source": [
    "local = iprPy.load_database('master_local')\n",
    "print(local)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Define run directories\n",
    "\n",
    "The high-throughput calculations are prepared and executed using local directories.  The paths to these directories can be saved and stored using simple names for easy access later."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **list_run_directories()** function returns a list of all of the names for the stored run directories."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['master_1', 'master_2', 'master_3', 'master_4', 'master_5', 'master_6', 'master_7', 'master_8']\n"
     ]
    }
   ],
   "source": [
    "print(settings.list_run_directories)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **set_run_directory()** function allows for a local run directory to be saved under a simple name. For best functionality, each run_directory should be for a unique database and number of cores."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define running directories for up to four cores\n",
    "torun_root = Path('e:/calculations/ipr/torun')\n",
    "dbname = 'master'\n",
    "\n",
    "for i in range(8):\n",
    "    settings.set_run_directory(name = f'{dbname}_{i+1}', \n",
    "                               path = Path(torun_root, dbname, f'{i+1}'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **load_run_directory()** function accesses the stored directory path associated with a run directory's name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "E:\\calculations\\ipr\\torun\\master\\1\n"
     ]
    }
   ],
   "source": [
    "run_directory = iprPy.load_run_directory('master_1')\n",
    "print(run_directory)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "database = local"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Build database by copying reference records into it\n",
    "\n",
    "The **build_refs()** method copies the reference records in iprPy/library to the database for use in high-throughput calculations.\n",
    "\n",
    "Parameters\n",
    "\n",
    "- __lib_directory__ (*str or path, optional*) The directory path for the library.  If not given, then it will use the iprPy library directory.\n",
    "- __refresh__ (*bool or list, optional*) If False (default) only new reference records are added.  If True, all existing reference records are refreshed by deleting the current ones in the database and uploading the references in lib_directory.  If a list is given, then only the reference record styles named in the list are refreshed.\n",
    "- __include__ (*str or list, optional*) The reference record style(s) to copy to the database.  If not given will upload all record styles found in lib_directory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "database = master\n",
    "#refresh = False\n",
    "refresh = False\n",
    "include = [\n",
    "    'crystal_prototype',\n",
    "    'dislocation',\n",
    "    'free_surface', \n",
    "    'point_defect',\n",
    "    'potential_LAMMPS',\n",
    "    'reference_crystal',\n",
    "    #'relaxed_crystal',\n",
    "    'stacking_fault',\n",
    "]\n",
    "#refresh = include = 'dislocation'\n",
    "\n",
    "database.build_refs(refresh=refresh, include=include)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Check record numbers and status"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **check_records()** method checks how many records of a given style are stored in the database.  If the record is a calculation record, it will also display how many are unfinished, issued errors, or have successfully finished."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In database style mongo at localhost:27017.iprPy:\n",
      "- 333 of style potential_LAMMPS\n"
     ]
    }
   ],
   "source": [
    "database.check_records('potential_LAMMPS')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "database.check_records('calculation_phonon')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Copy records between databases\n",
    "\n",
    "The **copy_records()** method copies records from the current database to another database.  Either a list of records \n",
    "\n",
    "Parameters\n",
    "        \n",
    "- **dbase2** (*iprPy.Database*) The database to copy to.\n",
    "\n",
    "- **record_style** (*str, optional*) The record style to copy.  If record_style and records not given, then the available record styles will be listed and the user prompted to pick one.  Cannot be given with records.\n",
    "\n",
    "- **records** (*list, optional*) A list of iprPy.Record objects from the current database to copy to dbase2.  Allows the user full control on which records to copy/update.  Cannot be given with record_style.\n",
    "\n",
    "- **includetar** (*bool, optional*) If True, the tar archives will be copied along with the records. If False, only the records will be copied. (Default is True).\n",
    "\n",
    "- **overwrite** (*bool, optional*) If False (default) only new records and tars will be copied. If True, all existing content will be updated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "947 records to try to copy\n",
      "947 records added/updated\n",
      "947 tars added/updated\n"
     ]
    }
   ],
   "source": [
    "source_database = local\n",
    "dest_database = master\n",
    "\n",
    "record_style = 'calculation_E_vs_r_scan'\n",
    "#records = source_database.get_records(...)\n",
    "\n",
    "source_database.copy_records(dest_database, \n",
    "                             record_style=record_style,\n",
    "                             #records=records,\n",
    "                             includetar=True,\n",
    "                             overwrite=True,\n",
    "                            )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "record = local.get_record(name='0a31c1dc-0377-406e-a6b4-36843e4cd39c')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "tar = local.get_tar(record=record, raw=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "master.add_tar(record=record, tar=tar)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Clean calculation records\n",
    "\n",
    "The **clean_records()** method resets errored calculations of a specified record style.  Cleaning a record style means:\n",
    "\n",
    "- Resetting any calculations that issued errors back into a run_directory\n",
    "\n",
    "- Removing any .bid files in the calculation folders in the run_directory\n",
    "\n",
    "This is useful to resetting and rerunning calculations that may have failed for reasons external to the calculation's method.  E.g. runners terminated early, parameter conflicts for a limited number of potentials, debugging calculations.\n",
    "\n",
    "__WARNING:__ Conflicts may occur if you clean a run_directory that active runners are operating on as the .bid files are used to avoid multiple runners working on the same calculation at the same time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 records to clean\n"
     ]
    }
   ],
   "source": [
    "database.clean_records(record_style='calculation_E_vs_r_scan', run_directory='test_1')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Destroy calculation records\n",
    "\n",
    "The **destroy_records()** method deletes all records of a specified style.  Useful if you want to reset any library records or rerun calculations with different parameters. \n",
    "\n",
    "**WARNING:** This is a permanent delete even for local database styles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "#database.destroy_records('calculation_surface_energy_static')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7. Forget database information\n",
    "\n",
    "The **unset_database()** and **unset_run_directory()** functions will remove the saved settings for the databases. \n",
    "\n",
    "**NOTE:** Only the stored access information is removed as the records in a database and files in a run_directory will remain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Clear out existing definitions\n",
    "#iprPy.unset_database(name='demo')\n",
    "#iprPy.unset_run_directory(name='demo_1')\n",
    "#iprPy.unset_run_directory(name='demo_2')\n",
    "#iprPy.unset_run_directory(name='demo_3')\n",
    "#iprPy.unset_run_directory(name='demo_4')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
